Due 2/2 by the start of class.
Working in Teams: Stepping Stones progress
For the coming week, teams should start the following:
- For the two already collected measures they’ve been assigned,
start drafting the purpose – how is this measure useful – and potential
policy contexts or changes that may be relevant to understanding the
measures. These are just draft ideas to get us thinking; they don’t need
to be fully refined or fully researched/informed – we’re just thinking
about what else we might need to know to appropriately interpret and
contextualize the measures.
- Write an initial R script to download the first measure in their set
and review the city Excel file for background and issues.
- Begin work on a request for any measure not available online (e.g.,
identify who to contact, draft an email). We won’t reach out to request
data yet, as some teams may need to coordinate and we want to run our
plans by our city partners.
This work should be submitted to the Stepping Stones channel in Slack
so we can better learn from one another.
Working in R
Using the police stops data from Charlottesville we shared in
the first class (also downloadable from GitHub), write a script that
does the following:
Loads tidyverse and janitor, reads in the police stops data (use the
argument for column types we used in class), and cleans the variable
names.
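The loading step can be sketched roughly as follows. This is only an illustration on a tiny inline table: the column names, values, and the col_types specification are all invented stand-ins, so swap in the real file path and the column-type argument from class.

```r
library(tidyverse)
library(janitor)

# Toy stand-in for the stops file; headers and rows are invented.
toy_csv <- I("Agency Name,Reason For Stop,Action Taken,Age
Charlottesville Police,Traffic Violation,Warning,34
Charlottesville Police,Terry Stop,No Enforcement,0
Albemarle County Police,Equipment Violation,Citation,51")

# Placeholder col_types: read everything as character except Age.
# Use the specification from class for the real data.
stops <- read_csv(
  toy_csv,
  col_types = cols(.default = col_character(), Age = col_double())
) %>%
  clean_names()  # "Agency Name" becomes agency_name, and so on

names(stops)
```

clean_names() converts headers to lowercase snake_case, which is why the later questions can refer to names like reason_for_stop.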
Examine the data again (e.g., things like names and variable
structures and summaries). Based on your examination, what’s the most
common action taken in police stops in Charlottesville?
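As a hedged illustration of what examining the data might look like, the sketch below runs a few common inspection functions on an invented three-row table. The column name action_taken is an assumption about what the cleaned data contains; check names() on the real data first.

```r
library(tidyverse)

# Invented toy rows; the real data has many more columns and values.
stops <- tribble(
  ~action_taken, ~age,
  "Warning",     34,
  "Citation",    51,
  "Warning",     0
)

names(stops)    # variable names
glimpse(stops)  # types and the first few values of each column
summary(stops)  # basic summaries per column

# Tabulate one variable, most frequent value first
action_counts <- stops %>%
  count(action_taken, sort = TRUE)
action_counts
```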
Generate a new variable for age – call it age_recode – and set
it equal to missing (NA) if age is 0 and equal to the value of age
otherwise (hint: be sure to save this variable to the data frame; that
is, assign the data frame to itself before piping into dplyr
functions). How frequently is the age variable missing (0)?
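One possible way to approach the recode, shown on invented toy rows, is if_else() inside mutate(). This is a sketch and not the only option (dplyr's na_if() would also work); it assumes age was read in as a number.

```r
library(tidyverse)

# Invented toy rows; assumes age is numeric in the real data too.
stops <- tibble(
  reason_for_stop = c("Terry Stop", "Traffic Violation",
                      "Terry Stop", "Equipment Violation"),
  age = c(0, 34, 27, 0)
)

# Assign the result back to stops so age_recode is kept
stops <- stops %>%
  mutate(age_recode = if_else(age == 0, NA_real_, age))

# Count missing versus present values of the recoded variable
stops %>%
  count(is.na(age_recode))
```

Without the `stops <-` assignment, the new column would only be printed to the console and would not exist in later steps.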
Practice using the dplyr functions filter, select, arrange, and
count. Generate code that answers the following questions using these
functions (hint: you won’t need all of them for an answer, but use only
some combination of them to generate the answer. This time, printing the
answer to the console is enough; you don’t need to save the result into
a named object).
Looking only at Terry stops (reason_for_stop) made by the
Charlottesville police (agency_name), what is the distribution of race
among individuals stopped?
Looking only at Terry stops (reason_for_stop) made by the
Charlottesville police (agency_name), show the race and action taken for
all observations (hint: you can add print(n = X) to a set of piped
functions to show X values in the console).
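The general pattern for these questions might look like the sketch below, run on invented toy rows. The exact strings "Terry Stop" and "Charlottesville Police" and the race categories are assumptions; check the real values with count() before filtering. The counts are saved into an object here only so the result can be inspected; printing to the console is enough for the assignment.

```r
library(tidyverse)

# Invented toy rows; category labels are placeholders.
stops <- tribble(
  ~agency_name,              ~reason_for_stop,    ~race,     ~action_taken,
  "Charlottesville Police",  "Terry Stop",        "Group A", "Warning",
  "Charlottesville Police",  "Terry Stop",        "Group B", "No Enforcement",
  "Charlottesville Police",  "Traffic Violation", "Group B", "Citation",
  "Albemarle County Police", "Terry Stop",        "Group A", "Arrest",
  "Charlottesville Police",  "Terry Stop",        "Group A", "Arrest"
)

# Distribution of race among the filtered stops
race_counts <- stops %>%
  filter(reason_for_stop == "Terry Stop",
         agency_name == "Charlottesville Police") %>%
  count(race, sort = TRUE)
race_counts

# Race and action taken for every filtered row, printing up to 20 rows
stops %>%
  filter(reason_for_stop == "Terry Stop",
         agency_name == "Charlottesville Police") %>%
  select(race, action_taken) %>%
  arrange(race) %>%
  print(n = 20)
```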
Practice using dplyr functions mutate, group_by, summarize and
filter. Generate code that answers the following questions using a
combination from among these functions.
Among stops where age is present (hint: where !is.na(age_recode) or
age != 0), how many stops are there and what is the minimum, mean, and
maximum age of those stopped by reason for the stop?
Among Terry stops, generate the proportion of actions taken;
that is, what proportion of Terry stops result in an arrest, a warning,
a citation, or no enforcement?
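A minimal sketch of the grouped-summary pattern, again on invented toy rows with assumed column names and category labels. The object names age_by_reason and terry_props are hypothetical and used only for illustration.

```r
library(tidyverse)

# Invented toy rows; real values and labels will differ.
stops <- tribble(
  ~reason_for_stop,    ~action_taken,    ~age,
  "Terry Stop",        "Warning",        20,
  "Terry Stop",        "Arrest",         0,
  "Terry Stop",        "Warning",        40,
  "Traffic Violation", "Citation",       30,
  "Terry Stop",        "No Enforcement", 30
)

# Number of stops and age summaries by reason, dropping rows with age 0
age_by_reason <- stops %>%
  filter(age != 0) %>%
  group_by(reason_for_stop) %>%
  summarize(
    n_stops  = n(),
    min_age  = min(age),
    mean_age = mean(age),
    max_age  = max(age)
  )
age_by_reason

# Proportion of each action among Terry stops.
# summarize() drops the single grouping level, so sum(n) is the overall total.
terry_props <- stops %>%
  filter(reason_for_stop == "Terry Stop") %>%
  group_by(action_taken) %>%
  summarize(n = n()) %>%
  mutate(prop = n / sum(n))
terry_props
```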
Save the script into the scripts folder of your learningR
folder. When complete, submit this file to me via direct message on
Slack (give it a name like week2_mpc.R, as I’ll be adding everyone’s to
the same script folder in my own version of this folder!).